 Oswego County


The Viral 'DoorDash Girl' Saga Unearthed a Nightmare for Black Creators

WIRED

A delivery driver posted a TikTok alleging she had been sexually assaulted by a customer. The deepfakes that followed reveal a growing digital blackface problem. When DoorDash delivery driver Livie Rose Henderson posted a video alleging that one of her customers sexually assaulted her in October, it set off a firestorm of reactions. Henderson's TikTok claimed that when she was dropping off a delivery in Oswego, New York, she found a customer's front door wide open and inside, a man on the couch with his pants and underwear pulled down to his ankles. Henderson was dubbed the "DoorDash Girl," and her video accrued tens of millions of views, including some supportive and consoling responses to what she said she had endured on the job as a young woman.


A $100 Billion Chip Project Forced a 91-Year-Old Woman From Her Home

WIRED

Azalia King was the last holdout preventing the construction of a Micron megafab. Onondaga County authorities threatened to use eminent domain to take her home away by force. Azalia King moved into an upstate New York home surrounded by sprawling cattle pastures around 1965, about the time that mass production of the world's first microchips began. Now, 60 years later, the 91-year-old is on the verge of losing her home to make way for what could become the largest chipmaking complex in the US. Local authorities threatened to exercise their power of eminent domain, or taking land for public benefit, to forcibly uproot King and proceed with construction on a $100 billion campus where US tech giant Micron plans to make memory chips for use in a variety of electronics.


Synthetic Data-Driven Prompt Tuning for Financial QA over Tables and Documents

Yu, Yaoning, Chang, Kai-Min, Yu, Ye, Wei, Kai, Luo, Haojing, Wang, Haohan

arXiv.org Artificial Intelligence

Financial documents like earnings reports or balance sheets often involve long tables and multi-page reports. Large language models have become a new tool for numerical reasoning over these documents and for understanding them. However, prompt quality can have a major effect on how well LLMs perform these financial reasoning tasks. Most current methods tune prompts on fixed datasets of financial text or tabular data, which limits their ability to adapt to new question types or document structures, or they rely on costly, manually labeled or curated datasets to build the prompts. We introduce a self-improving prompt framework driven by data-augmented optimization. In this closed-loop process, we generate synthetic financial tables and document excerpts, verify their correctness and robustness, and then update the prompt based on the results. Specifically, our framework combines a synthetic data generator with verifiers and a prompt optimizer, where the generator produces new examples that expose weaknesses in the current prompt, the verifiers check the validity and robustness of the produced examples, and the optimizer incrementally refines the prompt in response. By iterating these steps in a feedback cycle, our method steadily improves prompt accuracy on financial reasoning tasks without needing external labels. Evaluation on the DocMath-Eval benchmark demonstrates that our system achieves higher performance in both accuracy and robustness than standard prompt methods, underscoring the value of incorporating synthetic data generation into prompt learning for financial applications.
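The generate, verify, and optimize cycle described in this abstract can be illustrated with a toy sketch. Everything here is hypothetical and is not the authors' implementation: `Example`, `generate`, `verify`, `toy_model`, and `optimize` are made-up names, the "prompt" is reduced to a set of rule names, and a deterministic stand-in replaces the LLM. It only shows the shape of a closed loop that needs no external labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    kind: str       # hypothetical question type: "sum" or "growth"
    values: tuple   # numbers drawn from a synthetic table row
    answer: float   # label attached by the generator


def solve(kind, values):
    """Reference computation, used to label and to verify examples."""
    if kind == "sum":
        return float(sum(values))
    if kind == "growth":
        return (values[1] - values[0]) / values[0]
    raise ValueError(kind)


def generate(round_idx):
    """Toy generator: deterministic examples, one with a deliberately bad label."""
    b = round_idx + 1
    return [
        Example("sum", (b, 2 * b, 3), solve("sum", (b, 2 * b, 3))),
        Example("growth", (100.0, 100.0 + 10 * b),
                solve("growth", (100.0, 100.0 + 10 * b))),
        Example("sum", (1, 1), 99.0),  # invalid; the verifier should drop it
    ]


def verify(example):
    """Toy verifier: keep an example only if its label can be recomputed."""
    return abs(solve(example.kind, example.values) - example.answer) < 1e-9


def toy_model(prompt_rules, example):
    """Stand-in for an LLM: correct only when the prompt covers the question type."""
    if example.kind in prompt_rules:
        return solve(example.kind, example.values)
    return None


def optimize(prompt_rules, failures):
    """Toy optimizer: refine incrementally, one new rule per round."""
    if not failures:
        return prompt_rules
    return prompt_rules | {failures[0].kind}


def run(rounds=3):
    prompt_rules = frozenset()
    history = []  # accuracy on verified examples, per round
    for r in range(rounds):
        examples = [e for e in generate(r) if verify(e)]
        failures = [e for e in examples if toy_model(prompt_rules, e) != e.answer]
        history.append(1 - len(failures) / len(examples))
        prompt_rules = optimize(prompt_rules, failures)
    return prompt_rules, history


if __name__ == "__main__":
    print(run(3))
```

In this toy setup the accuracy on verified examples rises round by round as the optimizer adds rules for the failing question types. A real system would replace `toy_model` with an LLM call and `optimize` with an actual prompt rewriter.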


Fine-resolution landscape-scale biomass mapping using a spatiotemporal patchwork of LiDAR coverages

Johnson, Lucas K., Mahoney, Michael J., Bevilacqua, Eddie, Stehman, Stephen V., Domke, Grant, Beier, Colin M.

arXiv.org Artificial Intelligence

Estimating forest aboveground biomass (AGB) at large scales and fine spatial resolutions has become increasingly important for greenhouse gas accounting, monitoring, and verification efforts to mitigate climate change. Airborne LiDAR is highly valuable for modeling attributes of forest structure including AGB, yet most LiDAR collections take place at local or regional scales covering irregular, non-contiguous footprints, resulting in a patchwork of different landscape segments at various points in time. Here, as part of a statewide forest carbon assessment for New York State (USA), we addressed common obstacles in leveraging a LiDAR patchwork for AGB mapping at landscape scales, including selection of training data, the investigation of regional or coverage-specific patterns in prediction error, and map agreement with field inventory across multiple scales. Three machine learning algorithms and an ensemble model were trained with Forest Inventory and Analysis (FIA) field measurements, airborne LiDAR, and topographic, climatic and cadastral geodata. Using a strict set of plot selection criteria, 801 FIA plots were selected with co-located point clouds drawn from a patchwork of 17 leaf-off LiDAR coverages (2014-2019). Our ensemble model was used to produce 30 m AGB prediction surfaces within a predictor-defined area of applicability (98% of LiDAR coverage), and the resulting AGB maps were compared with FIA plot-level and areal estimates at multiple scales of aggregation. Overall, our model was accurate (% RMSE 22-45%; MAE 11.6-29.4 Mg ha$^{-1}$; ME 2.4-6.3 Mg ha$^{-1}$), explained 73-80% of field-observed variation, and yielded estimates that were consistent with FIA's design-based estimates (89% of estimates within FIA's 95% CI). We share practical solutions to challenges faced in using spatiotemporal patchworks of LiDAR to meet growing needs for AGB mapping in support of applications in forest carbon accounting and ecosystem.
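For readers unfamiliar with the accuracy metrics quoted in this abstract, the sketch below computes them using their usual definitions. Two conventions are assumptions, since the abstract does not state them: % RMSE is taken as RMSE divided by the mean observed value, and ME (mean error) is taken as predicted minus observed. The function name and the plot values are invented for illustration and are not data from the paper.

```python
import math


def agreement_metrics(observed, predicted):
    """Return RMSE, % RMSE, MAE, and ME for paired plot-level values (e.g. Mg/ha)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]  # assumed sign: predicted - observed
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_obs = sum(observed) / n
    return {
        "rmse": rmse,
        "pct_rmse": 100.0 * rmse / mean_obs,   # assumed normalisation: mean observed
        "mae": sum(abs(e) for e in errors) / n,
        "me": sum(errors) / n,                 # positive means over-prediction on average
    }


if __name__ == "__main__":
    # Made-up values, only to exercise the function.
    print(agreement_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 330.0]))
```

A positive ME together with a larger MAE, as in the reported ranges, indicates a small average over-prediction alongside larger plot-to-plot errors that partly cancel.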


Go With the Flow, on Jupiter and Snow. Coherence From Model-Free Video Data without Trajectories

AlMomani, Abd AlRahman, Bollt, Erik M.

arXiv.org Machine Learning

Viewing a data set such as the clouds of Jupiter, coherence is readily apparent to human observers, especially the Great Red Spot, but also other great storms and persistent structures. There are now many different definitions and perspectives mathematically describing coherent structures, but we will take an image processing perspective here. We describe how to infer coherent sets of a fluidic system directly from image data, without attempting to first model underlying flow fields, an approach related to a concept in image processing called motion tracking. In contrast to standard spectral methods for image processing, which are generally related to a symmetric affinity matrix and lead to standard spectral graph theory, we need a non-symmetric affinity, which arises naturally from the underlying arrow of time. We develop an anisotropic, directed diffusion operator corresponding to flow on a directed graph, from a directed affinity matrix developed with coherence in mind, and corresponding spectral graph theory from the graph Laplacian. Our methodology is not offered as more accurate than other traditional methods of finding coherent sets, but rather our approach works with alternative kinds of data sets, in the absence of a vector field. Our examples will include partitioning the weather and cloud structures of Jupiter, and a lake-effect snow event local to Potsdam, N.Y., on Earth, as well as the benchmark double-gyre test system.
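To make the idea of a non-symmetric affinity concrete, the sketch below builds a row-stochastic transition matrix P = D^{-1} W from a small directed affinity matrix W and forms the random-walk Laplacian L = I - P. This is a generic textbook construction chosen for illustration; it is an assumption that it resembles the operator in the paper, whose exact affinity and Laplacian are not given in the abstract. The matrix `W` and both function names are hypothetical.

```python
def row_normalize(W):
    """Row-stochastic transition matrix P = D^{-1} W from non-negative affinities."""
    P = []
    for row in W:
        out_degree = sum(row)
        if out_degree == 0:
            raise ValueError("every node needs at least one outgoing edge")
        P.append([w / out_degree for w in row])
    return P


def random_walk_laplacian(P):
    """L = I - P for a row-stochastic P; each row of L sums to zero."""
    n = len(P)
    return [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical directed affinities among four image regions.
# W[i][j] != W[j][i] in general: "i flows to j" differs from "j flows to i".
W = [
    [1.0, 4.0, 0.0, 0.0],
    [3.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 4.0],
    [1.0, 0.0, 3.0, 1.0],
]

P = row_normalize(W)
L = random_walk_laplacian(P)

if __name__ == "__main__":
    for row in L:
        print([round(x, 3) for x in row])
```

In this toy matrix, nodes 0 and 1 exchange most of their weight with each other, as do nodes 2 and 3, which is the kind of weakly coupled block structure that a spectral partition of such an operator is meant to pick up.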